{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# Data Structure Management \n", "## CREATE Database\n", "\n", "First you need to create your database. The ``` CREATE DATABASE``` statement is used to create a new SQL Database:\n", "\n", "```sql\n", "-- Create a database named `my_first_db`\n", "CREATE DATABASE my_first_db;\n", "```\n", "\n", "Notice that the statement creates a database called my_first_database. Note also two important things about SQL syntax:\n", "\n", "- **comments start with '--':** Just add two dash lines to add a comment, both line comment or in-line comments work this way.\n", "- **statements end with a separator (';')**: In SQL the statements are separated by a special character known as separator. You can change the default separator settings, but for now, let´s just assume that is the character ';'. You need to end each statement with this separator. This is very important if you use a command line client, if you don't enter the statement separator, it will not execute anything!\n", "- **Keywords are capitalized**: Your code will work either ways, but it is a common practice to capitalize statements\n", " \n" ] }, { "cell_type": "markdown", "source": [ "## Show Databases\n", "The command ```SHOW DATABASES``` will show a list of the databases in your database engine:\n", "\n", "```sql\n", "-- list databases\n", "SHOW DATABASES;\n", "```\n" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Drop Database\n", "The ```DROP``` database command *drops* the database and completely removes all its data in the database:\n", "\n", "```sql\n", "-- Drop the database from the example above:\n", "DROP DATABASE my_first_db;\n", "```\n", "\n", "The drop command **cannot be reversed**, so special care needs to be taken in production environments!\n" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Backup database\n", "In production, you need to backup your data periodically. The best practice is to use a utility tool to backup your data. For instance, ```mysqldump``` is a command line tool used to backup data in Mysql databases. Database backup is a rather complex procedure out of the scope of this course. For more info, refer to the manuals (e.g: [MySQL backup reference manual](https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html))." ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Use database\n", "Since you might have several databases in the same database server, you need to specify which one you want to use. To do this, you use the ```USE``` database statement:\n", "```sql\n", "-- use database my_first_db\n", "USE my_first_db;\n", "```" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Create Table\n", "The ```CREATE TABLE``` creates a new table in a database and specifies the columns in the table:\n", "\n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT, -- Normal Integer. At most 4294967296 different PersonIds\n", " Name VARCHAR(30), -- Name with at most 30 characters\n", " IsAwesome TINYINT(1), -- Either 0 (False) or 1 (True) \n", " AwesomenessLevel DECIMAL(13,3), -- Decimal value with 13 digits of precision, up to 10^10 and with three digits after the decimal point\n", " Gender ENUM('Male', 'Female', 'Other') -- Either male, female, or non binary\n", " );\n", "```\n", "\n", "The ```CREATE TABLE``` includes a comma separated list of the columns in the table, each column definition must include at least the name and the type. The following section includes some important type definitions.\n", "Just as with databases, you can list the created tables with the ```SHOW TABLES``` command:\n", "\n", " ```sql\n", "-- show tables in used database\n", "SHOW TABLES;\n", "```\n", "\n", "and drop a table (warning!!!! you cant reverse this!!):\n", "\n", "```sql\n", "drop table Person;\n", "```\n", "\n" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Data Types\n", "Data elements in a relational database have a specific **type**, which is specified in the definition of the table where they are stored. As you will see below, each column in a table needs to have a type and a name, and the type definition needs to be supported by the specific database engine you are using. The sections below describe some data types widely used in **MySQL**:\n", "\n", "In The type type syntax, types may use some optional parameters. For instance Integer numeric types use a parameter, *M* that indicates the maximum number of digits shown in the values returned by the engine. \n", "\n", "The following sections contain some relevant examples:\n", "\n", "### Integer\n", "A normal-size integer. The signed range is -2147483648 to 2147483647. The unsigned range is 0 to 4294967295.\n", "\n", "Integers use **4** of storage space. \n", "\n", "**Example** \n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT -- Normal Integer. At most 4294967296 different PersonIds\n", " );\n", "```\n", "### Big Integer\n", "A big-size Integer. The signed range is -9223372036854775808 to 9223372036854775807. The unsigned range is 0 to 18446744073709551615.\n", "\n", "Big Integers use **8** bytes of storage space. Alternatively, in MySQL, we can provide a parameter *M* that indicates the maximum number of digits shown in the values returned by the engine (it is worth noting that this is a MySQL specific feature).\n", "\n", "**Example** \n", "```mysql\n", "CREATE TABLE Person (\n", " PersonID BIGINT(20) -- Big integer with 20 digits. At most ~10^20 different PersonIds\n", " );\n", "```\n", "\n", "### Tiny Integer\n", "A tiny-size integer. A very small integer. The signed range is -128 to 127. The unsigned range is 0 to 255. A special case is TINYINT(1), which is equivalent to a boolean variable. A value of zero is considered false. Nonzero values are considered true.\n", "\n", "Tiny Integers use **1** byte of storage space.\n", "\n", "\n", "**Example** \n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT,\n", " IsAwesome TINYINT(1) -- Either 0 (False) or 1 (True)\n", " );\n", "```\n", "\n", "### Decimal\n", "Fixed-point number. It has two optional parameters. M is the total number of digits (the precision) and D is the number of digits after the decimal point (the scale). The decimal point and (for negative numbers) the - sign are not counted in M. If D is 0, values have no decimal point or fractional part. The maximum number of digits (M) for DECIMAL is 65. The maximum number of supported decimals (D) is 30. If D is omitted, the default is 0. If M is omitted, the default is 10. Values for DECIMAL columns are stored using a binary format that packs nine decimal digits into 4 bytes. The storage requirements for the integer and fractional parts of each value are determined separately. Each multiple of nine digits requires 4 bytes, and any remaining digits left over require some fraction of 4 bytes.\n", "\n", "**Example** \n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT,\n", " AwesomenessLevel DECIMAL(13,3) -- Decimal value with 13 digits of precision, up to 10^10 and with three digits after the decimal point\n", " );\n", "```\n", "\n", "### Varchar\n", "VARCHAR are variables optimised to store variable-length character arrays or strings. The type is declared with a parameter *M* that indicates the maximum number of characters in a field. \n", "Each field will use a variable number of bytes, one per character in the string. For instance, a value of 'abc' will require 3 bytes and a value of 'ab' 2 bytes.\n", "\n", "**Example** \n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT,\n", " Name VARCHAR(30) -- Name with at most 30 characters\n", " );\n", "```\n", "\n", "### Enum \n", "ENUM is a special type of string variable, where the value can only take one of a set of values specified in the declaration of the type.\n", "The strings you specify as input values are automatically encoded as numbers. \n", "\n", "The storage space needed is **1 or 2 bytes** depending on the values.\n", "\n", "**Example** \n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT,\n", " Gender ENUM('Male', 'Female', 'Other') -- Either male, female, or other\n", " );\n", "```\n", "\n", "### Timestamps\n", "The TIMESTAMP data type is used for values that contain both date and time parts. TIMESTAMP has a range of '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC.\n", "\n", "### Other types and some considerations\n", "There are other data types both in standard SQL. For instance, the [Blob](https://dev.mysql.com/doc/refman/8.0/en/blob.html) type and the [JSON](https://dev.mysql.com/doc/refman/8.0/en/json.html) type are two important types not covered in this basic tutorial.\n", "\n", "Also, it is important to consider that each engine extends the standard with proprietary types with specific features. Each developer needs to consider the trade-offs of using each type. For instance, proprietary extensions limit the ability of your development team to change the database engine, so you may find yourself in a **vendor lock-in** situation where it is very expensive to change the engine to an alternative. So, as a rule, developers should choose standard types. However proprietary types can improve the efficiency of your development team (because they make things easier) or the performance of your application (because they make things better!). \n", "A DATETIME or TIMESTAMP value can include a trailing fractional seconds part in up to microseconds (6 digits) precision.\n", "\n", "**Example**\n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT, \n", " Created TIMESTAMP -- Timestamp when the record was created\n", ");\n", "```\n", "\n", "### Default values\n", "Each column can have a default value. If a new row is inserted into the table without specifying a value for a column, the column will take the default value. (We will see how to insert rows in the next section).\n", "\n", "The following example defines a default value for each column except the ID:\n", "\n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT, -- Normal Integer. At most 4294967296 different PersonIds.\n", " Name VARCHAR(30) DEFAULT '', -- Name with at most 30 characters. Default empty string.\n", " IsAwesome TINYINT(1) DEFAULT 0, -- Either 0 (False) or 1 (True). Default 0\n", " AwesomenessLevel DECIMAL(13,3) DEFAULT 12.3, -- Decimal value with 13 digits of precision, up to 10^10 and with three digits after the decimal point. Defaults to 10.3\n", " Gender ENUM('Male', 'Female', 'Other') DEFAULT 'Male', -- Either male, female, or non binary. Default 'Male'\n", " Created TIMESTAMP DEFAULT CURRENT_TIMESTAMP -- timestamp with the date of creation, defaults to current timestamp\n", " );\n", "```" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Insert Statement\n", "The insert statement is used to insert new rows into a table. Take the following table as example:\n", "\n", "```sql\n", "CREATE TABLE Person (\n", " PersonID INT,\n", " Name VARCHAR(30), -- Name with at most 30 characters\n", " Created TIMESTAMP -- Timestamp of the last access\n", " );\n", "```\n", "\n", "The following insert statement will insert two records in the table Person with the following values:\n", "\n", "| PersonId | Name | Created |\n", "|----------|------|---------|\n", "|1 | Mark Grayson | 2021-10-05 07:38:16| \n", "| 2 | Eve Wilkins | 2021-10-05 07:45:16|\n", "\n", "**Example**\n", "```sql\n", "-- INSERT INTO table_name (field_lists) VALUES (values1), (values2), ...\n", "INSERT INTO Person (PersonId, Name, Created) VALUES (1,'Mark Grayson', '2021-10-05 07:38:16'), (2,'Eve Wilkins', '2021-10-05 07:45:16');\n", "```\n", "\n", "Note that the syntax requires the table name parameter, the list of column fields, and after the keyword values, a list of comma separated values containing the values of each row within brackets. The values of the fields of each row are also separated with commas and need to be provided **in the right order**.\n", "\n", "The values provided are casted to the types of each column, so you need to use the right **syntax**. Below are some example of the syntax used for common data types\n", "\n", "| Type | Examples |\n", "|---------------|-----------------------|\n", "| INT | 1 |\n", "| VARCHAR(20) | 'John Doe' |\n", "| DECIMAL(10,3) | 3.142 |\n", "| TIMESTAMP | '2021-10-05 00:00:00' |\n", "\n", "### Default values revisited\n", "If you do not provide a value for a field, the default value will be used. For instance, if you do not provide a value for the field Created, the default value will be used.\n", "\n", "The following statement will enter a value with the default timestamp:\n", "\n", "```sql\n", "INSERT INTO \n", " Person (PersonId, Name, IsAwesome, AwesomenessLevel, Gender) \n", "VALUES \n", "(1, 'Mark Grayson', 1, 1000.6, 'Male');\n", "```\n", "\n", "whereas the following example will create another record with just the required field PersonID:\n", "\n", "```sql\n", "INSERT INTO \n", " Person (PersonId) \n", "VALUES \n", "(2);\n", "```" ], "metadata": { "collapsed": false } }, { "cell_type": "markdown", "source": [ "## Select Statement\n", "The select statement is used to retrieve data from tables in our database. We will coming back to the select statement along the tutorials, so consider this just as an introduction to this statement.\n", "\n", "The select syntax is ```SELECT column1, column2, ...FROM table_name;```` that is, first we define the set of columns we want to select and then FROM which table. Let us see with an example:\n", "\n", "```sql\n", "SELECT \n", " PersonId, \n", " Name\n", "FROM Person;\n", "```\n", "\n", "You can use the wildcard character \"*\" to select all columns from the selected records:\n", "\n", "```sql\n", "SELECT \n", " *\n", "FROM Person;\n", "```\n", "\n", "### Table aliases\n", "You can define an **alias** or shorthand to the table name\n", "\n", "### WHERE Statement\n", "In the examples below, we retrieved all the rows from the table, but what if we were only interested in a subset of records? That´s where the WHERE statement comes into play. \n", "The WHERE statement allows us to define a set of logic rules to filter only the records we are interested in. We can combine the logic rules with the logical operands ```AND```, ```OR``` and ```NOT``` \n", "Let us see it in action with some examples:\n", "\n", "This example will only return records where the PersonId is higher than 2:\n", "```sql\n", "SELECT \n", " *\n", "FROM Person\n", "WHERE PersonId > 2;\n", "```\n", "\n", "This example will only return records created after the first of october and that correspond to male persons:\n", "\n", "```sql\n", "SELECT \n", " *\n", "FROM Person\n", "WHERE created > '2021-10-01 00:00:00' AND Gender = 'Male';\n", "```\n", "\n", "This example will return records where the Gender is either male or female:\n", "\n", "```sql\n", "SELECT \n", " *\n", "FROM Person\n", "WHERE Gender IN ('Male', 'Female');\n", "```\n", "\n", "### Delete statement\n", "You can delete records in a table with the ```DELETE STATEMENT```. Use the ```WHERE``` statement to specify which data you want to delete:\n", "\n", "```sql\n", "# Delete all records where Gender is 'Male'\n", "DELETE FROM Person\n", "WHERE Gender = 'Male'\n", "```\n", "\n", "**IMPORTANT**\n", "Do not forget to put the ```WHERE``` in the ```DELETE FROM``` unless you want to delete all records!\n", "### Insert into Select statement\n", "The INSERT INTO SELECT statement selects table from one table and copies it into another table. \n", "The syntax is as follows:\n", "\n", "```sql \n", "INSERT INTO table2 (column1, column2, column3, ...)\n", "SELECT column1, column2, column3, ...\n", "FROM table1\n", "WHERE condition;\n", "```\n", " \n", "As you can see, it combines the syntax of the INSERT INTO statement and the syntax of the SELECT statement.\n", "\n", "### Select into\n", "The select into statement copies selected data from one table into a new table. The syntax is:\n", "```sql\n", "SELECT *\n", "INTO newtable [IN externaldb]\n", "FROM oldtable\n", "WHERE condition;\n", "```" ], "metadata": { "collapsed": false } } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" }, "pycharm": { "stem_cell": { "cell_type": "raw", "source": [], "metadata": { "collapsed": false } } } }, "nbformat": 4, "nbformat_minor": 0 }